Building Multi-Marker Algorithms for Disease Prediction—The Role of Correlations Among Markers
Abstract
A widely held viewpoint in the field of predictive biomarkers for disease holds that no single marker can provide high enough discrimination and that a panel of markers, combined in some type of algorithm, will be needed. Motivated by a recent study in which 27 additional markers for ovarian cancer, many of which had good predictive value alone, failed to substantially increase the predictive ability of the primary marker, CA125, we explore the effect of additional markers on the area under the ROC curve (AUC). We develop a statistical model based on the multivariate normal distribution and linear algorithms and use it to show how the magnitude and direction of the statistical correlation among the markers (in diseased and in non-diseased subjects) determine the added predictive value of additional markers. We show mathematically and empirically that if an additional marker is negatively correlated with the primary marker, then combining it with the primary marker will always increase the AUC relative to the primary marker alone, even if the additional marker has little predictive ability on its own. In contrast, if an additional marker is positively correlated with the primary marker, then it is unlikely to substantially increase the AUC when combined with the primary marker, even when it has good predictive ability on its own. Thus, univariate analyses alone may not be the best approach for choosing which markers to combine in a predictive panel; patterns of statistical correlation should be considered when ranking top-performing biomarkers.
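The effect of correlation described in the abstract can be illustrated numerically. The sketch below is not the authors' code; it assumes the standard binormal setting, in which marker vectors are multivariate normal in both groups and the AUC of the best linear combination is Φ(√(Δᵀ(Σ_D + Σ_N)⁻¹Δ)), where Δ is the vector of mean differences and Σ_D, Σ_N are the covariance matrices in diseased and non-diseased subjects. The function names, the unit variances, and the effect sizes (1.0 for the primary marker, 0.3 for the additional one) are hypothetical values chosen only for illustration.

```python
import math
import numpy as np

def normal_cdf(x):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def best_linear_auc(delta, cov_diseased, cov_healthy):
    """AUC of the best linear combination under the binormal assumption.

    delta        : mean(diseased) - mean(non-diseased) for each marker
    cov_diseased : marker covariance matrix among diseased subjects
    cov_healthy  : marker covariance matrix among non-diseased subjects
    """
    delta = np.asarray(delta, dtype=float)
    pooled = np.asarray(cov_diseased, dtype=float) + np.asarray(cov_healthy, dtype=float)
    # Quadratic form delta^T (Sigma_D + Sigma_N)^{-1} delta
    quad = float(delta @ np.linalg.solve(pooled, delta))
    return normal_cdf(math.sqrt(quad))

def two_marker_auc(d1, d2, rho):
    """Two unit-variance markers with the same correlation rho in both groups."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    return best_linear_auc([d1, d2], cov, cov)

# Hypothetical effect sizes: a strong primary marker and a weak additional marker.
primary_only = best_linear_auc([1.0], [[1.0]], [[1.0]])
print(f"primary marker alone: AUC = {primary_only:.3f}")
for rho in (-0.5, 0.0, 0.3, 0.5):
    combined = two_marker_auc(1.0, 0.3, rho)
    print(f"rho = {rho:+.1f}: combined AUC = {combined:.3f}")
```

Under these assumptions the weak second marker adds the most when it is negatively correlated with the primary marker, and adds nothing at all when the correlation equals the ratio of the two effect sizes (here 0.3), which matches the qualitative pattern the abstract describes.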
Similar resources
C-reactive protein and other markers of inflammation in hemodialysis patients
Background: Hemodialysis patients are at greater risk of cardiovascular disease. Higher than expected cardiovascular morbidity and mortality in this population has been attributed to dyslipidemia as well as inflammation. The causes of inflammation in hemodialysis patients are multifactorial. Several markers were used for the detection of inflammatory reaction in patients with chronic renal dise...
Genetic Heterogeneity of PKD1 and PKD2 Genes in Iran and Determination of the Genotype/Phenotype Correlations in Several Families with Autosomal Dominant Polycystic Kidney Disease
Autosomal dominant polycystic kidney disease (ADPKD) is the most common genetic nephropathy, which is characterized by replacement of renal parenchyma with multiple cysts. In Iran, the disease prevalence within the chronic hemodialysis patient population is approximately 8-10%. So far, three genetic loci have been identified to be responsible for ADPKD. Little information is available concernin...
A Review of Microsatellite Marker Usage in the Assessment of Genetic Diversity of Camelus
Camels have been regarded as the ship of the desert, and they play a multi-utility role in the world. Estimation of genetic parameters is the foremost step towards managing genetic resources for their conservation and sustainable utilization. Microsatellite markers have been extensively used in cattle, sheep, goats and camels. However, genetic characterization studies on camels have been poorly recorded. T...
Association analysis for traits associated with powdery mildew tolerance in barley [Hordeum vulgare L.] using AFLP markers
Association analysis is a useful method for evaluating significant associations between molecular markers and trait phenotypes. This study was performed to evaluate the association between traits related to powdery mildew resistance and molecular markers. This investigation was performed using 77 barley genotypes and AFLP markers. In phenotypic evaluation, reaction of seedlings to powdery mild...
Effect of marker density and trait heritability on the accuracy of genomic prediction over three generations
The aim of this study was to determine the effect of marker density, level of heritability, number of QTLs, and size of training set on the genomic accuracy over three generations. Thereby, a trait was simulated with heritability of 0.10, 0.25 or 0.40. For each animal, a genome with 20 chromosomes, 1 Morgan each, was simulated. Different marker densities (2000, 4000 and 6000 markers) and 400 an...